Learning to discount transformations as the computational goal of visual cortex
Authors
Abstract
It has long been recognized that a key obstacle to achieving human-level object recognition performance is the problem of invariance [10]. The human visual system excels at factoring out the image transformations that distort object appearance under natural conditions. Models with a cortex-inspired architecture such as HMAX [9, 13], as well as non-biological convolutional neural networks [5], are invariant to translation (and in some cases scaling) by virtue of their wiring. The transformations to which this approach has been applied so far are generic transformations: a single example image of any object contains all the information needed to synthesize a new image of the transformed object [15]. In a setting in which transformation invariance must be learned from visual experience (as for a newborn human baby), we have shown that it is possible to learn from little visual experience how to be invariant to the translation of any object [7]. The same argument applies to all generic transformations.
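The kind of built-in translation invariance described above can be sketched with a minimal toy computation (an illustration only, not the HMAX implementation; all names are hypothetical): pooling, via a max, the dot products between an input and stored translated copies of a template yields a signature that is unchanged when the input itself is (cyclically) translated.

```python
import numpy as np

def signature(image, template, shifts):
    """Max-pool the dot products of `image` with shifted copies of `template`.

    Because the pool ranges over the whole cyclic translation group,
    the result is the same for any cyclic shift of the input image.
    """
    responses = [np.dot(np.roll(template, s), image) for s in shifts]
    return max(responses)

rng = np.random.default_rng(0)
template = rng.standard_normal(16)
image = rng.standard_normal(16)
shifts = range(16)  # the full cyclic translation group

s_original = signature(image, template, shifts)
s_shifted = signature(np.roll(image, 5), template, shifts)  # translated input

print(np.isclose(s_original, s_shifted))  # → True
```

The invariance here is exact because the pooling range covers the entire group of transformations; with a partial range (as in a finite receptive field), the signature is only approximately invariant.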
Related Papers
The computational magic of the ventral stream: sketch of a theory (and why some deep architectures work)
This paper explores the theoretical consequences of a simple assumption: the computational goal of the feedforward path in the ventral stream – from V1, V2, V4 and to IT – is to discount image transformations, after learning them during development. Part I assumes that a basic neural operation consists of dot products between input vectors and synaptic weights – which can be modified by learnin...
Full text
The computational magic of the ventral stream: sketch of a theory (and why some deep architectures work). December 30, 2012 DRAFT
This paper explores the theoretical consequences of a simple assumption: the computational goal of the feedforward path in the ventral stream – from V1, V2, V4 and to IT – is to discount image transformations, after learning them during development. Part I assumes that a basic neural operation consists of dot products between input vectors and synaptic weights – which can be modified by learnin...
Full text
Computational Principles of Cortical Representation and Development
In making sense of the world, brains actively reformat noisy and complex incoming sensory data to better serve the organism’s behavioral needs. In vision, retinal input is transformed into rich object-centric scenes; in audition, sound waves are transformed into words and sentences. As a computational neuroscientist with a background in applied mathematics, my basic goal is to reverse-engineer ...
Full text
Unsupervised learning of invariant representations with low sample complexity: the magic of sensory cortex or a new framework for machine learning?
The present phase of Machine Learning is characterized by supervised learning algorithms relying on large sets of labeled examples (n→∞). The next phase is likely to focus on algorithms capable of learning from very few labeled examples (n → 1), like humans seem able to do. We propose an approach to this problem and describe the underlying theory, based on the unsupervised, automatic learning o...
Full text
Unsupervised Learning of Invariant Representations in Hierarchical Architectures
The present phase of Machine Learning is characterized by supervised learning algorithms relying on large sets of labeled examples (n→∞). The next phase is likely to focus on algorithms capable of learning from very few labeled examples (n → 1), like humans seem able to do. We propose an approach to this problem and describe the underlying theory, based on the unsupervised, automatic learning o...
Full text